25 research outputs found
Vision-based ego-lane analysis system: dataset and algorithms
Lane detection and analysis are important and challenging tasks in advanced driver assistance systems and autonomous driving. These tasks are required to help autonomous and semi-autonomous vehicles operate safely. Decreasing costs of vision sensors and advances in embedded hardware have boosted lane-related
research – detection, estimation, tracking, etc. – over the past two decades. Interest in this topic has increased even more with the demand for advanced driver assistance systems (ADAS) and self-driving cars. Although extensively studied independently, there is still a need for studies that propose a combined solution to the multiple ego-lane-related problems, such as lane departure warning (LDW), lane change detection, lane marking type (LMT) classification, road marking detection and classification, and detection of the presence of adjacent lanes. This work proposes a real-time Ego-Lane Analysis System (ELAS) capable of estimating ego-lane position, classifying LMTs and road markings, performing LDW, and detecting lane change events. The proposed vision-based system works on
a temporal sequence of images. Lane marking features are extracted from both perspective and Inverse Perspective Mapping (IPM) images, and the two are combined to increase robustness. The final estimated lane is modeled as a spline using a combination of methods (Hough lines, a Kalman filter, and a particle filter). Based on the estimated lane, all other events are detected. Moreover, the proposed system was integrated for experimentation into an autonomous car being developed by the High Performance Computing Laboratory of the Universidade Federal do Espírito Santo. To validate the proposed algorithms and address the lack of lane datasets in the literature, a new dataset with more than 20 different scenes (more than 15,000 frames) covering a variety of scenarios (urban roads, highways, traffic, shadows, etc.) was created. The dataset was manually annotated and made publicly
available to enable evaluation of several events that are of interest to the research community (i.e., lane estimation, change, and centering; road markings; intersections; LMTs; crosswalks; and adjacent lanes). Furthermore, the system was also validated qualitatively through its integration with the autonomous vehicle. ELAS achieved high detection rates on all real-world events and proved to be ready for real-time applications.
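The abstract mentions that the lane spline is estimated with a combination of Hough lines, a Kalman filter, and a particle filter. As a rough illustration of the Kalman component only, the sketch below smooths a sequence of noisy lateral-offset measurements with a minimal 1-D filter; the state model, noise values, and data are illustrative assumptions, not taken from the paper.

```python
# Minimal 1-D Kalman filter illustrating the kind of temporal smoothing
# a lane tracker applies between frames. The state is a single lateral
# offset; process/measurement noise values (q, r) are illustrative.

def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Smooth a sequence of noisy lateral-offset measurements."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: constant-position model, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement via the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

noisy = [0.1, -0.2, 0.15, 0.05, -0.1, 0.0, 0.2, -0.05]
smooth = kalman_1d(noisy)
```

Because each update blends the prediction with the new measurement according to their relative uncertainties, the smoothed track fluctuates less than the raw measurements, which is the behavior a frame-to-frame lane tracker relies on.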
Budget-Aware Adapters for Multi-Domain Learning
Multi-Domain Learning (MDL) refers to the problem of learning a set of models
derived from a common deep architecture, each one specialized to perform a task
in a certain domain (e.g., photos, sketches, paintings). This paper tackles MDL
with a particular interest in obtaining domain-specific models with an
adjustable budget in terms of the number of network parameters and
computational complexity. Our intuition is that, as in real applications the
number of domains and tasks can be very large, an effective MDL approach should
not only focus on accuracy but also on having as few parameters as possible. To
implement this idea we derive specialized deep models for each domain by
adapting a pre-trained architecture but, differently from other methods, we
propose a novel strategy to automatically adjust the computational complexity
of the network. To this end, we introduce Budget-Aware Adapters, which select the most relevant feature channels to better handle data from a novel domain. Constraints on the number of active switches are imposed to obtain a network that respects the desired complexity budget. Experimentally, we show that
our approach leads to recognition accuracy competitive with state-of-the-art
approaches but with much lighter networks both in terms of storage and
computation.
Comment: ICCV 201
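The core idea described above (per-channel switches constrained by a complexity budget) can be sketched in a few lines. The selection rule and all names below are simplifying assumptions for illustration, not the paper's actual learned parameterization:

```python
# Illustrative sketch of budget-constrained channel switches: keep only
# the top-scoring fraction of channels allowed by the budget, and zero
# out the rest. Scores stand in for learned switch values.

def apply_budget(scores, budget):
    """Return a 0/1 switch per channel, at most budget * len(scores) active."""
    k = max(1, int(budget * len(scores)))
    keep = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]

def gate_channels(features, switches):
    """Zero out the channels whose switch is off."""
    return [f if s else 0.0 for f, s in zip(features, switches)]

scores = [0.9, 0.1, 0.7, 0.3, 0.5, 0.2, 0.8, 0.4]   # per-channel relevance
switches = apply_budget(scores, budget=0.5)           # 50% complexity budget
print(switches)   # → [1, 0, 1, 0, 1, 0, 1, 0]
```

Skipped channels need not be computed at all, which is how a tighter budget translates into lower storage and compute.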
Budget-Aware Pruning for Multi-Domain Learning
Deep learning has achieved state-of-the-art performance on several computer
vision tasks and domains. Nevertheless, it still has a high computational cost
and demands a significant number of parameters. Such requirements hinder the
use in resource-limited environments and demand both software and hardware
optimization. Another limitation is that deep models are usually specialized
into a single domain or task, requiring them to learn and store new parameters
for each new one. Multi-Domain Learning (MDL) attempts to solve this problem by
learning a single model that is capable of performing well in multiple domains.
Nevertheless, the models are usually larger than the baseline for a single
domain. This work tackles both of these problems: our objective is to prune
models capable of handling multiple domains according to a user-defined budget,
making them more computationally affordable while keeping a similar
classification performance. We achieve this by encouraging all domains to use a
similar subset of filters from the baseline model, up to the amount defined by
the user's budget. Then, filters that are not used by any domain are pruned
from the network. The proposed approach innovates by better adapting to
resource-limited devices while, to our knowledge, being the only work that is
capable of handling multiple domains at test time with fewer parameters and
lower computational complexity than the baseline model for a single domain.
Budget-Aware Pruning: Handling Multiple Domains with Less Parameters
Deep learning has achieved state-of-the-art performance on several computer
vision tasks and domains. Nevertheless, it still has a high computational cost
and demands a significant number of parameters. Such requirements hinder the
use in resource-limited environments and demand both software and hardware
optimization. Another limitation is that deep models are usually specialized
into a single domain or task, requiring them to learn and store new parameters
for each new one. Multi-Domain Learning (MDL) attempts to solve this problem by
learning a single model that is capable of performing well in multiple domains.
Nevertheless, the models are usually larger than the baseline for a single
domain. This work tackles both of these problems: our objective is to prune
models capable of handling multiple domains according to a user-defined budget,
making them more computationally affordable while keeping a similar
classification performance. We achieve this by encouraging all domains to use a
similar subset of filters from the baseline model, up to the amount defined by
the user's budget. Then, filters that are not used by any domain are pruned
from the network. The proposed approach innovates by better adapting to
resource-limited devices while, to our knowledge, being the only work that
handles multiple domains at test time with fewer parameters and lower
computational complexity than the baseline model for a single domain.
Comment: arXiv admin note: substantial text overlap with arXiv:2210.0810
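The pruning rule described above — drop any filter used by no domain — can be sketched as follows. The domain masks are illustrative made-up values, not learned ones:

```python
# Sketch of the pruning rule: each domain marks which baseline filters
# it uses; filters used by no domain are safe to prune. Masks below are
# illustrative, not learned values.

def prunable_filters(domain_masks):
    """Indices of filters that no domain uses (candidates for pruning)."""
    n = len(domain_masks[0])
    used = [any(mask[i] for mask in domain_masks) for i in range(n)]
    return [i for i, u in enumerate(used) if not u]

masks = {
    "photos":    [1, 1, 0, 1, 0, 0],
    "sketches":  [1, 0, 0, 1, 1, 0],
    "paintings": [0, 1, 0, 1, 1, 0],
}
pruned = prunable_filters(list(masks.values()))
print(pruned)   # filters 2 and 5 are unused by every domain
```

Encouraging the domains to share a similar subset of filters, up to the budget, maximizes the set of filters no domain touches, and hence how much can be pruned.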
Copycat CNN: Stealing Knowledge by Persuading Confession with Random Non-Labeled Data
In the past few years, Convolutional Neural Networks (CNNs) have been
achieving state-of-the-art performance on a variety of problems. Many companies
employ resources and money to generate these models and provide them as an API; it is therefore in their best interest to protect them, i.e., to prevent someone else from copying them. Recent studies revealed that state-of-the-art CNNs are vulnerable to adversarial example attacks, and this weakness indicates that CNNs do not need to operate in the problem domain (PD). Therefore, we
hypothesize that they also do not need to be trained with examples of the PD in
order to operate in it.
Given these facts, in this paper, we investigate if a target black-box CNN
can be copied by persuading it to confess its knowledge through random
non-labeled data. The copy is two-fold: i) the target network is queried with
random data and its predictions are used to create a fake dataset with the
knowledge of the network; and ii) a copycat network is trained with the fake
dataset and should be able to achieve performance similar to that of the target
network.
This hypothesis was evaluated locally in three problems (facial expression,
object, and crosswalk classification) and against a cloud-based API. In the
copy attacks, images from both non-problem domain and PD were used. All copycat
networks achieved at least 93.7% of the performance of the original models with
non-problem domain data, and at least 98.6% using additional data from the PD.
Additionally, the copycat CNN successfully copied at least 97.3% of the
performance of the Microsoft Azure Emotion API. Our results show that it is
possible to create a copycat CNN by simply querying a target network as a black box with random non-labeled data.
Comment: 8 pages, 3 figures, accepted by IJCNN 201
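The two-step attack described above can be sketched with a toy black box standing in for the target CNN: query it with random inputs, record its predictions as labels (the "fake" dataset), then fit a copycat on those labels alone. The target function, threshold learner, and all numbers below are illustrative assumptions, not the paper's models:

```python
import random

# Toy sketch of the two-step copycat attack. The "target" is a stand-in
# for a black-box CNN: we may only call it, never inspect its internals.

def target(x):
    """Black box we want to copy (internals assumed unknown)."""
    return 1 if x > 0.37 else 0

# Step i) query the target with random, non-labeled inputs and keep its
# predictions as labels, forming the fake dataset.
random.seed(0)
queries = [random.random() for _ in range(5000)]
fake_dataset = [(x, target(x)) for x in queries]

# Step ii) train a copycat on the fake dataset. A 1-D threshold learner
# stands in for the copycat network: place the boundary between the
# largest queried input labeled 0 and the smallest labeled 1.
lo = max(x for x, y in fake_dataset if y == 0)
hi = min(x for x, y in fake_dataset if y == 1)
learned_threshold = (lo + hi) / 2.0

def copycat(x):
    return 1 if x > learned_threshold else 0

# The copycat agrees with the target despite never seeing ground truth.
agreement = sum(copycat(x) == target(x) for x in queries) / len(queries)
```

Even with purely random queries, the copycat recovers the target's decision boundary to within the spacing of the sampled points, which mirrors the paper's finding that non-problem-domain data suffices to extract most of a model's behavior.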